DNN-Based Voice Activity Detection with Multi-Task Learning
Similar resources
DNN-based Causal Voice Activity Detector
Voice Activity Detectors (VAD) are important components in audio processing algorithms. In general, VADs are two-way classifiers that flag the audio frames containing voice activity. Most of them are based on the signal energy and build statistical models of the noise background and the speech signal. In the process of derivation, we are limited to simplified statistical models and this limi...
Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning
It has recently been shown that deep neural networks (DNN) can improve the quality of statistical parametric speech synthesis (SPSS) when using a source-filter vocoder. Our own previous work has furthermore shown that a dynamic sinusoidal model (DSM) is also highly suited to DNN-based SPSS, whereby sinusoids may either be used themselves as a “direct parameterisation” (DIR), or they may be enco...
CHiME4: Multichannel Enhancement Using Beamforming Driven by DNN-based Voice Activity Detection
In this work, we focus on methods for enhancing the six-channel CHiME4 data using beamforming that is driven by voice activity detectors (VAD). We propose two beamformers and two VADs that are based on trained deep neural networks (DNN). Their combinations are compared when used as front-ends whose outputs are forwarded to the baseline automatic speech recognition system. Results in terms of Word-...
Multi-Channel Voice Activity Detection Based on Conic Constraints
Unlike single microphone techniques for voice activity detection (VAD), multi-microphone signal processing usually exploits the spatial information of signals received at multiple microphones. In this paper, we propose a VAD algorithm based on conic constraints to achieve robustness against the direction of arrival (DOA) estimation error. The proposed algorithm uses the phase vector as feature ...
Multi-Task Learning and Weighted Cross-Entropy for DNN-Based Keyword Spotting
We propose improved Deep Neural Network (DNN) training loss functions for more accurate single keyword spotting on resource-constrained embedded devices. The loss function modifications consist of a combination of multi-task training and weighted cross-entropy. In the multi-task architecture, the keyword DNN acoustic model is trained with two tasks in parallel: the main task of predicting the ke...
Journal
Journal title: IEICE Transactions on Information and Systems
Year: 2016
ISSN: 0916-8532,1745-1361
DOI: 10.1587/transinf.2015edl8168